[1] 0.9640697
[1] 0.970621
Computing probabilities and quantiles from this important distribution
The normal (also known as ‘Gaussian’) distribution is written \(X\sim N(\mu,\sigma^2)\):
Its pdf:
\[ f(x)=\frac1{(2\pi\sigma^2)^{1/2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
(Just like the binomial, you don’t need to know this formula for this class)
Computing probabilities with the normal requires computers
We will always write \(X\sim N(\mu,\sigma^2)\) for the normal
R functions are specified with \(\mu\) and the standard deviation \(\sigma = \sqrt{\sigma^2}\) instead of the variance
Example
If \(X \sim N(5,9)\), then the probability \(X\) is less than 4 is:
[1] 0.3694413
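A minimal sketch of a call that reproduces this output, assuming the base R function `pnorm` (the normal cumulative distribution function). Note that `sd` is the standard deviation \(\sqrt{9} = 3\), not the variance:

```r
# X ~ N(5, 9): R expects the standard deviation, so sd = sqrt(9) = 3
mu <- 5
sigma <- sqrt(9)

# P(X < 4) from the cumulative distribution function
p_less_4 <- pnorm(4, mean = mu, sd = sigma)
print(p_less_4)  # 0.3694413
```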
Continuing with \(X \sim N(5,9)\):
(that is, normal with expected value 5 and variance 9)
Now, let’s ask the opposite question!
For a fixed probability \(p\), what is the value \(x\) so that \(P(X \leq x) = p\)?
This value is called a quantile
Concept Check: What would each of the following codes produce?
[1] 3
[1] 0.2
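The code for this concept check is not shown here. One pair of calls consistent with the two printed answers, assuming we are still working with \(X \sim N(5,9)\), nests `pnorm` and `qnorm` to show that they undo each other:

```r
# Assumed setup: X ~ N(5, 9), so sd = 3
mu <- 5
sigma <- 3

# pnorm turns a value into a probability; qnorm turns it back into the value
a <- qnorm(pnorm(3, mean = mu, sd = sigma), mean = mu, sd = sigma)

# qnorm turns a probability into a value; pnorm turns it back into the probability
b <- pnorm(qnorm(0.2, mean = mu, sd = sigma), mean = mu, sd = sigma)

print(a)  # 3
print(b)  # 0.2
```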
Suppose \(X \sim N(-3.2, 100)\)
that is, normal with expected value -3.2 and variance 100
Probability: What is the probability \(X\) equals 0?
This is always zero for a continuous random variable!
Probability: What is the probability \(X\) is less than 0?
Values: at what value \(x\) does \(P(X\leq x) = .05\)?
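The last two questions can be answered with `pnorm` (probability) and `qnorm` (quantile). A short sketch, using variable names chosen here for illustration:

```r
# X ~ N(-3.2, 100): sd = sqrt(100) = 10
mu <- -3.2
sigma <- sqrt(100)

# P(X < 0); for a continuous X this is the same as P(X <= 0)
p_below_0 <- pnorm(0, mean = mu, sd = sigma)

# The value x with P(X <= x) = .05 (the 5th percentile)
x_05 <- qnorm(0.05, mean = mu, sd = sigma)

print(p_below_0)  # about 0.6255
print(x_05)       # about -19.65
```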
Example
The scores on the SAT math section are \(X \sim N(520,3600)\)
Suppose again \(X \sim N(\mu, \sigma^2)\)
If we multiply/add constants \(a,b\) to \(X\) to form a new rv \(Y\):
\[ Y = aX + b \]
then \(Y \sim N(a\mu + b, a^2 \sigma^2)\)
(This should remind you of our \(E[X]\) and \(Var[X]\) results)
Example:
If \(X \sim N(150,100)\) then
\[ Y = 3X + 30 \sim N(480, 900) \]
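As a sanity check on this example, the sketch below computes the new mean and variance from the rule and then compares them with a simulation. The seed and sample size are arbitrary choices made here:

```r
# Parameters of X ~ N(150, 100) and the constants in Y = aX + b
mu_x <- 150
var_x <- 100
a <- 3
b <- 30

# Rule: Y ~ N(a * mu + b, a^2 * sigma^2)
mu_y <- a * mu_x + b     # 480
var_y <- a^2 * var_x     # 900

# Simulation check: transform draws of X and look at their mean and variance
set.seed(1)
x <- rnorm(1e5, mean = mu_x, sd = sqrt(var_x))
y <- a * x + b
print(c(mean = mean(y), variance = var(y)))  # close to 480 and 900
```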
We are measuring a machined part using a ruler on a .01” scale
If several people measure, they will get some random amount of measurement error:
\[ X \sim N(.875, .00005) \]
What is the probability that a measurement will be within \(\pm\).01 centimeters of the mean?
(An inch is 2.54 cm)
\(Y = 2.54X\), therefore \(Y \sim N(2.54 \times .875,\ 2.54^2 \times .00005) \approx N(2.2225, 0.00032)\)
\[\begin{align} P(2.2225 - .01 \leq Y \leq 2.2225 + .01) & = P(2.2125 \leq Y \leq 2.2325) \\ & = F(2.2325) - F(2.2125) \end{align}\]
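The difference of the two CDF values can be evaluated with `pnorm`. A sketch, with variable names chosen here for illustration and the standard deviation kept unrounded:

```r
# Measurement in inches: X ~ N(.875, .00005)
mu_in <- 0.875
var_in <- 0.00005
cm_per_in <- 2.54

# Y = 2.54 X, so the mean scales by 2.54 and the sd scales by 2.54
mu_cm <- cm_per_in * mu_in
sd_cm <- cm_per_in * sqrt(var_in)

# P(mu - .01 <= Y <= mu + .01) = F(mu + .01) - F(mu - .01)
p_within <- pnorm(mu_cm + 0.01, mean = mu_cm, sd = sd_cm) -
  pnorm(mu_cm - 0.01, mean = mu_cm, sd = sd_cm)
print(p_within)  # roughly 0.42
```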
When \(\mu=0\) and \(\sigma=1\), we call it the standard normal
\[ f(x)=\frac1{\sqrt{2\pi}}e^{-x^2/2},\,\,-\infty<x<\infty. \]
(Once again, you don’t need to know this formula for this class)
The standard normal will be notated as: \(Z \sim N(0,1)\)
Going from a general normal to a standard normal is known as standardization
\[ X \sim N(\mu, \sigma^2) \longrightarrow Z = \frac{X - \mu}{\sigma} \sim N(0,1) \]
The opposite (actually, the inverse) is true
\[ Z \sim N(0, 1) \longrightarrow X = Z \sigma + \mu \sim N(\mu,\sigma^2) \]
(this is using the results in Transforming Normals)
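A quick check of standardization in R: a probability computed from the general normal should match the probability computed from the standard normal at the standardized value. The numbers reuse \(X \sim N(5,9)\) and \(x = 4\) purely as an illustration:

```r
mu <- 5
sigma <- 3
x <- 4

# Standardize: z = (x - mu) / sigma
z <- (x - mu) / sigma

# Same probability either way (pnorm defaults to mean = 0, sd = 1)
p_general <- pnorm(x, mean = mu, sd = sigma)
p_standard <- pnorm(z)
print(all.equal(p_general, p_standard))  # TRUE

# Unstandardize: x = z * sigma + mu recovers the original value
x_back <- z * sigma + mu
print(x_back)  # 4
```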
The z-score can be computed:
\[ \text{z-score} = \frac{x - \mu}{\sigma} \]
(For a value \(x\) from a distribution with mean \(\mu\) and variance \(\sigma^2\))
Question: Which of the two values from two different distributions is more unusual?
Answer: Whichever z-score has larger magnitude
(that is, largest |z-score|)
Example: We want to compare Olympic records in men’s and women’s sprinting. Which is more unusual?
Let’s look at a table:
| Category | Men’s | Women’s |
|---|---|---|
| average (seconds) | 9.85 | 10.83 |
| variance (seconds²) | .0057 | .0049 |
| record (seconds) | 9.63 | 10.61 |
Let’s compute the two z-scores:
The women’s result is more unusual
The z-score for men is -2.914 while for women it is -3.143
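The two z-scores can be reproduced directly from the table. Remember that the table gives variances, so we take square roots to get standard deviations:

```r
# z-score = (record - average) / sqrt(variance)
z_men <- (9.63 - 9.85) / sqrt(0.0057)
z_women <- (10.61 - 10.83) / sqrt(0.0049)

print(round(c(men = z_men, women = z_women), 3))  # -2.914 and -3.143

# The more unusual record is the one with the larger |z-score|
print(abs(z_women) > abs(z_men))  # TRUE
```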
Another use of z-scores: what value \(x\) would be equally as unusual?
Answer: Choose the \(x\) so that the z-scores are equal
Example: What time for a men’s sprinter would be equivalent to the female record?
| Category | Men’s | Women’s |
|---|---|---|
| average (seconds) | 9.85 | 10.83 |
| variance (seconds²) | .0057 | .0049 |
| record (seconds) | 9.63 | 10.61 |
We found the female z-score = \(\frac{10.61 - 10.83}{\sqrt{.0049}}\) = -3.143
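To find the equivalent time, we need to unstandardize \[ \text{z-score} \times \sigma + \mu = x \]
Here, use the women’s z-score with the men’s \(\mu = 9.85\) and \(\sigma = \sqrt{.0057}\): \[ x = -3.143 \times \sqrt{.0057} + 9.85 \approx 9.613 \]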
The male record would need to be 9.613 seconds to be as unusual as the female record.
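A sketch of this unstandardization in R, taking the women’s z-score and applying the men’s mean and standard deviation from the table:

```r
# Women's z-score from the table
z_women <- (10.61 - 10.83) / sqrt(0.0049)

# Unstandardize with the men's parameters: x = z * sigma + mu
mu_men <- 9.85
sigma_men <- sqrt(0.0057)
x_men <- z_women * sigma_men + mu_men

print(round(x_men, 3))  # 9.613
```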
Intervals centered at the mean of a normal distribution contain predictable amounts of probability
This is the empirical rule:
\(P( |Z| \leq 1) \approx 0.68\)
\(P( |Z| \leq 2) \approx 0.95\)
\(P( |Z| \leq 3) \approx 0.997\)
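These three values are rounded. The unrounded probabilities can be checked with `pnorm`, using \(P(|Z| \leq k) = F(k) - F(-k)\):

```r
# P(|Z| <= k) for k = 1, 2, 3 under the standard normal
k <- 1:3
p <- pnorm(k) - pnorm(-k)
print(round(p, 4))  # 0.6827 0.9545 0.9973
```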
On exams, there will be a link to an applet: Probability applet
Note: This applet is totally optional and unnecessary if you use R
Returning to SAT example:
[1] 0.9087888
[1] 625.0412
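The code that produced these two outputs is not shown here. The calls below reproduce them under the assumption that the questions were "what is the probability a score is at most 600?" and "what score is the 96th percentile?". The 600 is a guess made here to match the printed value:

```r
# SAT math scores: X ~ N(520, 3600), so sd = sqrt(3600) = 60
mu <- 520
sigma <- sqrt(3600)

# Assumed question 1: P(X <= 600)
p_600 <- pnorm(600, mean = mu, sd = sigma)
print(p_600)  # 0.9087888

# Assumed question 2: the score x with P(X <= x) = .96
x_96 <- qnorm(0.96, mean = mu, sd = sigma)
print(x_96)   # 625.0412
```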
To use the applet
(here, 1.751 came from the applet by using the probability .96 from SAT example)
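Since the applet reports a z-score from the standard normal, it still has to be unstandardized to get back to the SAT scale. A small sketch of that last step, using the rounded value 1.751:

```r
# z-score read from the applet for probability .96 (rounded)
z_applet <- 1.751

# Unstandardize with the SAT parameters: x = z * sigma + mu
x_applet <- z_applet * 60 + 520
print(x_applet)  # 625.06, close to the 625.0412 from R; the gap is rounding in z
```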
The normal plays a central role in probability and statistics
We will return to it later during sampling distributions…